-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28836 Parallize the file archival to improve the split times #6616
base: master
Are you sure you want to change the base?
Conversation
@virajjasani @stoty I am giving it another try. I hope this doesn't fail this time. I am running tests locally and they haven't finished yet. But i am optimistic this time. 🤞🏾 |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I still see some failures that are not the usual flakies. |
c9022c1
to
4bd65e7
Compare
4bd65e7
to
d3b481d
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@stoty No test failure this time. Please have a look. Let me know if i am missing something here. |
hbase-server/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java
Show resolved
Hide resolved
Queue<File> failures, String startTime) { | ||
LOG.trace("Archiving {} files concurrently into directory: {}", files.size(), baseArchiveDir); | ||
|
||
ExecutorService executorService = Executors.newCachedThreadPool(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most such similar pools are configurable.
Have you configured making the thread pool configurable ?
Would it make sense to use a global pool here, and limit the number of concurrent move operations ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering we are shutting down the threads after execution is it okay if we give some valid constant rather than a configuration? I am of the opinion that one more configuration would not help us. I also understand that having a max cap on number of threads is an important aspect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I don't have a lot of experience with object storage deletion performance.
Are resolveAndArchive calls serial, or is it possible to have multiple invocations running at the same time ?
What do you think @wchevreuil, @BukrosSzabolcs ?
No description provided.